Randomized Structural Sparsity based Support Identification with Applications to Locating Activated or Discriminative Brain Areas: A Multi-center Reproducibility Study
In this paper, we focus on how to locate the relevant or discriminative brain
regions related to an external stimulus or a certain mental disease, which is also
called support identification, based on the neuroimaging data. The main
difficulty lies in the extremely high dimensional voxel space and relatively
few training samples, easily resulting in an unstable brain region discovery
(called feature selection in the context of pattern recognition). When the
training samples come from different centers and have between-center variations,
it is even harder to obtain a reliable and consistent result.
Correspondingly, we revisit our recently proposed algorithm based on stability
selection and structural sparsity. It is applied to the multi-center MRI data
analysis for the first time. A consistent and stable result is achieved across
different centers despite the between-center data variation while many other
state-of-the-art methods, such as the two-sample t-test, fail. Moreover, we have
empirically shown that the performance of this algorithm is robust and
insensitive to several of its key parameters. In addition, the support
identification results on both functional MRI and structural MRI are
interpretable and can serve as potential biomarkers.
Comment: arXiv admin note: text overlap with arXiv:1410.465
Powering One-shot Topological NAS with Stabilized Share-parameter Proxy
One-shot NAS methods have attracted much interest from the research community
due to their remarkable training efficiency and capacity to discover
high-performance models. However, the search spaces of previous one-shot works
usually relied on hand-crafted design and lacked flexibility in the
network topology. In this work, we try to enhance one-shot NAS by exploring
high-performing network architectures in our large-scale Topology Augmented
Search Space (i.e., over 3.4*10^10 different topological structures).
Specifically, the difficulties of architecture search in such a complex
space have been eliminated by the proposed stabilized share-parameter proxy,
which employs Stochastic Gradient Langevin Dynamics to enable fast shared
parameter sampling, so as to achieve stabilized measurement of architecture
performance even in search spaces with complex topological structures. The
proposed method, namely Stabilized Topological Neural Architecture Search
(ST-NAS), achieves state-of-the-art performance under a Multiply-Adds (MAdds)
constraint on ImageNet. Our lite model ST-NAS-A achieves 76.4% top-1 accuracy
with only 326M MAdds. Our moderate model ST-NAS-B achieves 77.9% top-1 accuracy
while requiring only 503M MAdds. Both of our models offer superior performance in
comparison to other concurrent works on one-shot NAS.
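The abstract above names Stochastic Gradient Langevin Dynamics (SGLD) as the sampler for shared parameters. The following is a minimal sketch of the generic SGLD update on a toy one-dimensional target, not the ST-NAS implementation. The function name `sgld_sample`, the Gaussian target, the simulated gradient noise, and all step sizes are assumptions chosen for illustration.

```python
import numpy as np

def sgld_sample(noisy_grad, theta0, step_size, n_steps, rng):
    """Draw a chain with the SGLD update
    theta <- theta - (step/2) * g(theta) + N(0, step),
    where g is a (possibly noisy) gradient of the negative log posterior."""
    theta = np.asarray(theta0, dtype=float).copy()
    samples = np.empty((n_steps,) + theta.shape)
    for t in range(n_steps):
        noise = rng.normal(0.0, np.sqrt(step_size), size=theta.shape)
        theta = theta - 0.5 * step_size * noisy_grad(theta) + noise
        samples[t] = theta
    return samples

# Toy target (assumption): unit-variance Gaussian centred at 2.0.
rng = np.random.default_rng(0)
target_mean = np.array([2.0])

def noisy_grad(theta):
    # Exact gradient of 0.5 * (theta - mean)^2 plus noise that mimics
    # the variance of a minibatch gradient estimate.
    return (theta - target_mean) + rng.normal(0.0, 0.5, size=theta.shape)

chain = sgld_sample(noisy_grad, np.zeros(1), step_size=0.05, n_steps=20000, rng=rng)
posterior_mean = chain[5000:].mean()   # discard the first 5000 steps as burn-in
```

In this sketch the injected noise has variance equal to the step size, which is what lets the chain explore the target distribution rather than collapse to a single optimum.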
High oxygen pressure floating zone growth and crystal structure of the layered nickelates R4Ni3O10 (R=La, Pr)
Single crystals of the metallic Ruddlesden-Popper trilayer nickelates
R4Ni3O10 (R=La, Pr) were successfully grown using an optical-image
floating zone furnace under oxygen pressure (pO2) of 20 bar for
La4Ni3O10 and 140 bar for Pr4Ni3O10. A combination of
synchrotron and laboratory x-ray single crystal diffraction, high-resolution
synchrotron x-ray powder diffraction and measurements of physical properties
revealed that R4Ni3O10 (R=La, Pr) crystallizes in the monoclinic
P21/a (Z=2) space group at room temperature, and that a metastable
orthorhombic phase (Bmab) can be trapped by post-growth rapid cooling. Both
La4Ni3O10 and Pr4Ni3O10 crystals undergo a metal-to-metal
transition (MMT) below room temperature. In the case of Pr4Ni3O10,
the MMT is found at ~157.6 K. For La4Ni3O10, the MMT depends on the
lattice symmetry: 147.5 K for Bmab vs. 138.6 K for P21/a. Lattice
anomalies were found at the MMT that, when considered together with the
pronounced dependence of the transition temperature on subtle structural
differences between the Bmab and P21/a phases, demonstrate a not
insignificant coupling between electronic and lattice degrees of freedom in
these trilayer nickelates.
Comment: 21 pages, 8 figures, 3 tables
HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis
Pedestrian analysis plays a vital role in intelligent video surveillance and
is a key component for security-centric computer vision systems. Although
convolutional neural networks are remarkable at learning discriminative
features from images, the learning of comprehensive features of pedestrians for
fine-grained tasks remains an open problem. In this study, we propose a new
attention-based deep neural network, named HydraPlus-Net (HP-net), that
multi-directionally feeds the multi-level attention maps to different feature
layers. The attentive deep features learned from the proposed HP-net bring
unique advantages: (1) the model is capable of capturing multiple attentions
from low-level to semantic-level, and (2) it explores the multi-scale
selectiveness of attentive features to enrich the final feature representations
for a pedestrian image. We demonstrate the effectiveness and generality of the
proposed HP-net for pedestrian analysis on two tasks, i.e. pedestrian attribute
recognition and person re-identification. Extensive experimental results
show that the HP-net outperforms the state-of-the-art methods
on various datasets.
Comment: Accepted by ICCV 201
Randomized Structural Sparsity via Constrained Block Subsampling for Improved Sensitivity of Discriminative Voxel Identification
In this paper, we consider voxel selection for functional Magnetic Resonance
Imaging (fMRI) brain data with the aim of finding a more complete set of
probably correlated discriminative voxels, thus improving interpretation of the
discovered potential biomarkers. The main difficulty in doing this is an
extremely high dimensional voxel space and few training samples, resulting in
unreliable feature selection. In order to deal with the difficulty, stability
selection has received a great deal of attention lately, especially due to its
finite sample control of false discoveries and transparent principle for
choosing a proper amount of regularization. However, it fails to make explicit
use of the correlation property or structural information of these
discriminative features and leads to large false negative rates. In other
words, many relevant but probably correlated discriminative voxels are missed.
Thus, we propose a new variant of stability selection, "randomized structural
sparsity", which incorporates the idea of structural sparsity. Numerical
experiments demonstrate that our method can be superior in controlling for
false negatives while also keeping the control of false positives inherited
from stability selection.
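As a rough illustration of the baseline this abstract builds on, the sketch below runs plain stability selection: fit a sparse model on many random half-samples and keep the features selected most often. It does not implement the paper's randomized structural sparsity or constrained block subsampling. The Lasso solver (ISTA), the synthetic data, the regularization strength, and the 0.6 threshold are all assumptions made for this example.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xw||^2 + lam * ||w||_1 by proximal gradient (ISTA)."""
    n, p = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)   # 1 / Lipschitz constant of the smooth part
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-thresholding
    return w

def stability_selection(X, y, lam, n_rounds=50, threshold=0.6, seed=0):
    """Return per-feature selection frequencies and the indices above the threshold."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_rounds):
        idx = rng.choice(n, size=n // 2, replace=False)   # random half-sample
        counts += lasso_ista(X[idx], y[idx], lam) != 0
    freq = counts / n_rounds
    return freq, np.flatnonzero(freq >= threshold)

# Synthetic data (illustrative only): 3 informative features out of 40.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 40))
y = X[:, :3] @ np.array([3.0, -3.0, 3.0]) + 0.1 * rng.normal(size=200)
freq, support = stability_selection(X, y, lam=0.5)
```

The selection frequency `freq` is the quantity that stability selection thresholds. With correlated features, such as neighbouring voxels, a plain Lasso tends to pick only one member of each correlated group, which is the false-negative problem the abstract describes.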
PV-NAS: Practical Neural Architecture Search for Video Recognition
Recently, deep learning has been utilized to solve video recognition problems
due to its prominent representation ability. Deep neural networks for video
tasks are highly customized, and the design of such networks requires domain
experts and costly trial-and-error tests. Recent advances in network
architecture search have boosted image recognition performance by a large
margin. However, the automatic design of video recognition networks is less
explored. In this study, we propose a practical solution, namely Practical
Video Neural Architecture Search (PV-NAS). Our PV-NAS can efficiently search
across a tremendously large number of architectures in a novel spatial-temporal
network search space using gradient-based search methods. To avoid getting
stuck in sub-optimal solutions, we propose a novel learning rate scheduler to
encourage sufficient network diversity of the searched models. Extensive
empirical evaluations show that the proposed PV-NAS achieves state-of-the-art
performance with much fewer computational resources. 1) Among light-weight
models, our PV-NAS-L achieves 78.7% and 62.5% Top-1 accuracy on Kinetics-400
and Something-Something V2, which are better than previous state-of-the-art
methods (i.e., TSM) by a large margin (4.6% and 3.4% on each dataset,
respectively), and 2) among medium-weight models, our PV-NAS-M achieves the
best performance (also a new record) on the Something-Something V2 dataset.
Dealing with Non-Stationarity in Multi-Agent Reinforcement Learning via Trust Region Decomposition
Non-stationarity is one thorny issue in multi-agent reinforcement learning,
which is caused by the policy changes of agents during the learning procedure.
Current approaches to this problem have their own limitations in effectiveness
and scalability, such as centralized critic and decentralized actor (CCDA),
population-based self-play, and modeling of others. In this paper, we
introduce a δ-stationarity measurement to explicitly model the
stationarity of a policy sequence, which is theoretically proved to be
proportional to the joint policy divergence. However, simple policy
factorization like mean-field approximation will lead to larger policy
divergence, which can be considered as the trust region decomposition dilemma. We
model the joint policy as a general Markov random field and propose a trust
region decomposition network based on message passing to estimate the joint
policy divergence more accurately. The Multi-Agent Mirror descent policy
algorithm with Trust region decomposition, called MAMT, is established with the
purpose of satisfying δ-stationarity. MAMT can adjust the trust region of
the local policies adaptively in an end-to-end manner, thereby approximately
constraining the divergence of the joint policy to alleviate the non-stationarity
problem. Our method brings noticeable and stable performance improvement
compared with baselines in coordination tasks of different complexity.
Comment: 32 pages, 23 figures
Video Generation from Single Semantic Label Map
This paper proposes the novel task of video generation conditioned on a
SINGLE semantic label map, which provides a good balance between flexibility
and quality in the generation process. Different from typical end-to-end
approaches, which model both scene content and dynamics in a single step, we
propose to decompose this difficult task into two sub-problems. As current
image generation methods do better than video generation in terms of detail, we
synthesize high quality content by only generating the first frame. Then we
animate the scene based on its semantic meaning to obtain the temporally
coherent video, giving us excellent results overall. We employ a cVAE for
predicting optical flow as a beneficial intermediate step to generate a video
sequence conditioned on the initial single frame. A semantic label map is
integrated into the flow prediction module to achieve major improvements in the
image-to-video generation process. Extensive experiments on the Cityscapes
dataset show that our method outperforms all competing methods.
Comment: Paper accepted at CVPR 2019. Source code and models available at
https://github.com/junting/seg2vid/tree/maste
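The abstract describes predicting optical flow and using it to produce later frames from the first one. One common way to use a flow field is to warp the starting frame with it; the sketch below shows nearest-neighbour backward warping under that assumption. It is not taken from the seg2vid code, and the function name `warp_frame` and the (dx, dy) flow layout are hypothetical choices for this example.

```python
import numpy as np

def warp_frame(frame, flow):
    """Backward-warp `frame` (H, W) with `flow` (H, W, 2) holding (dx, dy) per pixel.

    Each output pixel (y, x) is read from frame[y + dy, x + dx], rounded to the
    nearest pixel and clamped to the image border.
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

frame0 = np.arange(16, dtype=float).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0                      # every pixel samples one column to its right
frame1 = warp_frame(frame0, flow)       # first row becomes [1, 2, 3, 3]
```

A learned model would typically use bilinear sampling so that the warp stays differentiable; nearest-neighbour lookup is used here only to keep the example short.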
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
Text-image cross-modal retrieval is a challenging task in the field of
language and vision. Most previous approaches independently embed images and
sentences into a joint embedding space and compare their similarities. However,
previous approaches rarely explore the interactions between images and
sentences before calculating similarities in the joint space. Intuitively, when
matching between images and sentences, human beings would alternately attend
to regions in images and words in sentences, and select the most salient
information considering the interaction between both modalities. In this paper,
we propose Cross-modal Adaptive Message Passing (CAMP), which adaptively
controls the information flow for message passing across modalities. Our
approach not only takes comprehensive and fine-grained cross-modal interactions
into account, but also properly handles negative pairs and irrelevant
information with an adaptive gating scheme. Moreover, instead of conventional
joint embedding approaches for text-image matching, we infer the matching score
based on the fused features, and propose a hardest negative binary
cross-entropy loss for training. Results on COCO and Flickr30k significantly
surpass state-of-the-art methods, demonstrating the effectiveness of our
approach.
Comment: Accepted by ICCV 201
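To make the "hardest negative binary cross-entropy loss" mentioned above more concrete, here is one plausible form of such a loss over a batch of matching logits. The exact formulation used by CAMP may differ: the function name, the symmetric treatment of both retrieval directions, and the batch layout (true pairs on the diagonal) are assumptions for this sketch.

```python
import numpy as np

def hardest_negative_bce(logits):
    """logits[i, j]: matching logit for image i and sentence j; true pairs lie on the diagonal."""
    n = logits.shape[0]
    log_sig = lambda z: -np.logaddexp(0.0, -z)          # numerically stable log(sigmoid(z))
    pos = np.diag(logits)
    masked = np.where(np.eye(n, dtype=bool), -np.inf, logits)
    hardest_sentence = masked.max(axis=1)               # per image: highest-scoring wrong sentence
    hardest_image = masked.max(axis=0)                  # per sentence: highest-scoring wrong image
    loss = -(log_sig(pos) + log_sig(-hardest_sentence)) \
           - (log_sig(pos) + log_sig(-hardest_image))
    return loss.mean()

good = np.where(np.eye(3, dtype=bool), 5.0, -5.0)       # confident, correct matching scores
loss_good = hardest_negative_bce(good)
loss_bad = hardest_negative_bce(-good)                  # confident, wrong matching scores
```

Only the single most confusing negative per row and per column contributes to the loss, so the gradient concentrates on the pairs the model currently gets most wrong.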
Stacked Charge Stripes in Quasi-Two-Dimensional Trilayer Nickelate La4Ni3O8
The quasi-two-dimensional nickelate La4Ni3O8 (La-438) is an anion deficient
n=3 Ruddlesden-Popper (R-P) phase that consists of trilayer networks of square
planar Ni ions, formally assigned as Ni1+ and Ni2+ in a 2:1 ratio. While
previous studies on polycrystalline samples have identified a 105 K phase
transition with a pronounced electronic and magnetic response but weak lattice
character, no consensus on the origin of this transition has been reached. Here
we show using synchrotron x-ray diffraction on high-pO2 floating-zone grown
single crystals that this transition is driven by a real space ordering of
charge into a quasi-2D charge stripe ground state. The charge stripe
superlattice propagation vector, q=(2/3, 0, 1), corresponds with that found in
the related 1/3-hole doped single layer Ruddlesden-Popper nickelate,
La5/3Sr1/3NiO4 (LSNO-1/3, Ni2.33+) with orientation at 45-degrees to the Ni-O
bonds. Like LSNO-1/3, the charge stripes in La-438 are weakly correlated along
the c axis to form a staggered ABAB stacking that minimizes the Coulomb
repulsion among the stripes. Surprisingly, however, we find that the charge
stripes within each trilayer of La-438 are stacked in phase from one layer to
the next, at odds with any simple Coulomb repulsion argument.
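The formal valences quoted in this abstract follow from charge neutrality. The short check below assumes the usual formal charges La3+, Sr2+ and O2-; the helper name `average_ni_valence` is hypothetical.

```python
from fractions import Fraction

def average_ni_valence(cation_charge, n_ni, n_o):
    """Average formal Ni valence from charge neutrality, assuming O is 2-.

    cation_charge: total positive charge of the non-Ni cations per formula unit.
    """
    return (Fraction(2) * n_o - cation_charge) / n_ni

# La4Ni3O8: four La3+ ions, three Ni, eight O2-.
la438 = average_ni_valence(Fraction(4 * 3), 3, 8)
# A 2:1 mix of Ni1+ and Ni2+ gives the same average.
mix_2_to_1 = Fraction(2 * 1 + 1 * 2, 3)
# La5/3Sr1/3NiO4: 5/3 La3+ plus 1/3 Sr2+, one Ni, four O2-.
lsno = average_ni_valence(Fraction(5, 3) * 3 + Fraction(1, 3) * 2, 1, 4)
```

Here `la438` and `mix_2_to_1` both equal 4/3 (about 1.33+), and `lsno` equals 7/3, matching the Ni2.33+ quoted for LSNO-1/3.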
- …